Method and arrangement for managing grammar options in a graphical callflow builder

ABSTRACT

A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.

BACKGROUND OF THE INVENTION

1. Technical Field

This invention relates to the field of graphical user interfaces and more particularly to a graphical callflow builder.

2. Description of the Related Art

Systems exist that allow callflow designers to write simple grammar options or separately select prebuilt grammar files in graphical callflow builders. Some systems are described below. There is no system that allows designers who do not have any technical knowledge of speech grammars to both select a pre-built grammar file and write simple grammar options in the same element of a callflow. Furthermore, there is no other system that lets a designer select a specific output of a prebuilt grammar for special treatment in a callflow. The system described below overcomes these problems.

One such system, described in U.S. Pat. No. 6,510,411, simplifies the process of developing call or dialogue flows for use in an Interactive Voice Response system. The three principal aspects of that invention are a task-oriented dialogue model (or task model), a development tool and a dialogue manager. The task model is a framework for describing the application-specific information needed to perform the task. The development tool is an object that interprets a user specified task model and outputs information for a spoken dialogue system to perform according to the specified task model. The dialogue manager is a runtime system that uses output from the development tool in carrying out interactive dialogues to perform the task specified according to the task model. The dialogue manager conducts the dialogue using the task model and its built-in knowledge of dialogue management. In addition, generic knowledge of how to conduct a dialogue is separated from the specific information to be collected in a particular application. It is only necessary for the developer to provide the specific information about the structure of a task, leaving the specifics of dialogue management to the dialogue manager. That invention describes a form-based method for developing very simple speech applications, and does not address the use of external grammar files at all.

Another system, U.S. Pat. No. 6,269,336, discusses a voice browser for interactive services. A markup language document, as described in U.S. Pat. No. 6,269,336, includes a dialogue element including a plurality of markup language elements. Each of the plurality of markup language elements is identifiable by at least one markup tag. A step element is contained within the dialogue element to define a state within the dialogue element. The step element includes a prompt element and an input element. The prompt element includes an announcement to be read to the user. The input element includes at least one input that corresponds to a user input. A method in accordance with that patent includes the steps of creating a markup language document having a plurality of elements, selecting a prompt element, and defining a voice communication in the prompt element to be read to the user. The method further includes the steps of selecting an input element and defining an input variable to store data inputted by the user. Although that invention describes a markup language similar, but not identical, to VoiceXML, and includes the capacity (like VoiceXML) to refer to either built-in or external grammars, it does not address the resolution of specific new options with the contents of existing grammars.

U.S. Pat. No. 6,173,266 discusses a dialogue module that includes computer readable instructions for accomplishing a predefined interactive dialogue task in an interactive speech application. In response to user input, a subset of the plurality of dialogue modules are selected to accomplish their respective interactive dialogue tasks in the interactive speech application and are interconnected in an order defining the callflow of the application, and the application is generated. A graphical user interface represents the stored plurality of dialogue modules as icons in a graphical display in which icons for the subset of dialogue modules are selected in the graphical display. In response to user input, the icons for the subset of dialogue modules are graphically interconnected into a graphical representation of the callflow of the interactive speech application, and the interactive speech application is generated based upon the graphical representation. Using the graphical display, the method further includes associating configuration parameters with specific dialogue modules. Once again, this existing invention describes a graphical callflow builder using dialogue modules as elements, but does not address the resolution of specific new options with the contents of existing grammars.

SUMMARY OF THE INVENTION

Embodiments in accordance with the invention can enable callflow designers to work more efficiently with lists of variables in a graphical callflow builder, particularly where users can create their own variable names. Furthermore, embodiments disclosed herein overcome the problems described above through the automatic evaluation of options added to prompts in a graphical callflow when the prompt is using one or more existing grammars. The nature of this evaluation is to determine if the added options are present in one or more of the existing grammars. If not present, the added prompts are used as external referents for use in the graphical callflow and become part of a new generated grammar. If present, the added prompts are only used as external referents for use in the graphical callflow and do not become part of a new generated grammar.

In a first aspect of the invention, a method for a speech recognition application callflow can include the steps of placing a prompt into a workspace for the speech recognition application workflow and attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt. The pre-built grammars can be selected from a list. The method can further include the step of searching the list of pre-built grammars for matches to the user-entered individual new option. If a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option can point to an equivalent pre-built grammar. If a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option can form a part of the list of pre-built grammars.

In a second aspect of the invention, a method in a speech recognition application callflow can include the steps of assigning an individual option and a pre-built grammar to the same prompt, treating the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase or an annotation in the pre-built grammar, and treating the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.

In a third aspect of the invention, a system for managing grammar options in a graphical callflow builder can include a memory and a processor. The processor can be programmed to place a prompt into a workspace for the speech recognition application workflow and to attach at least one among a pre-built grammar and a user-entered individual new option to the prompt.

In a fourth aspect of the invention, a computer program has a plurality of code sections executable by a machine for causing the machine to perform certain steps as described in the method and systems above.

BRIEF DESCRIPTION OF THE DRAWINGS

There are shown in the drawings embodiments which are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.

FIG. 1 is a flow diagram illustrating a method in a speech recognition application callflow in accordance with the present invention.

FIG. 2 is an exemplary instantiation of a callflow GUI with system- and user-generated labels for callflow elements in accordance with the present invention.

FIGS. 3A and 3B illustrate a callflow element prompt and callflow element in accordance with the present invention.

FIG. 4 is a portion of an exemplary instantiation of a callflow GUI in accordance with the present invention.

DETAILED DESCRIPTION OF THE INVENTION

In our proposed system, designers can put a prompt into a workspace, then attach either prebuilt grammars from a list or attach individual new options, or both. To keep the system as parsimonious as possible, and to prevent potential conflicts between multiple grammars, if the user combines a prebuilt grammar and any new options, the system searches the prebuilt grammar for any matches to the new options, searching both valid utterances and associated annotations. If the new option exists in the grammar, the ‘new’ option simply points to the equivalent grammar entry. Otherwise, the new option becomes part of a grammar automatically built to hold it, with the entry in the new grammar having the text of the new option as both the recognition string and an associated annotation. Thus, without any deep understanding of the structure of a speech recognition grammar, callflow designers can create or work with grammars with a high degree of flexibility.
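The matching rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation of the described system: it assumes a prebuilt grammar has already been flattened into a list of phrase/annotation pairs, and the names `GrammarEntry`, `find_match` and `resolve_option`, as well as the case-insensitive comparison, are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GrammarEntry:
    """One recognizable phrase and the annotation returned for it."""
    phrase: str
    annotation: str


def _norm(text: str) -> str:
    # Assumption: matching ignores case and extra whitespace.
    return " ".join(text.lower().split())


def find_match(option: str, grammar: List[GrammarEntry]) -> Optional[GrammarEntry]:
    """Return the first entry whose phrase or annotation equals the option."""
    wanted = _norm(option)
    for entry in grammar:
        if wanted in (_norm(entry.phrase), _norm(entry.annotation)):
            return entry
    return None


def resolve_option(option: str, prebuilt: List[GrammarEntry]) -> Tuple[GrammarEntry, bool]:
    """Return (entry, is_new).

    If the option is already covered by the prebuilt grammar, point to the
    existing entry. Otherwise build an entry for an auxiliary grammar whose
    recognition string and annotation are both the option text.
    """
    match = find_match(option, prebuilt)
    if match is not None:
        return match, False
    return GrammarEntry(phrase=option, annotation=option), True


# Hypothetical, hand-flattened excerpt of a time grammar.
prebuilt_time = [
    GrammarEntry("noon", "noon"),
    GrammarEntry("midnight", "1200AM"),
]
existing, existing_is_new = resolve_option("midnight", prebuilt_time)
added, added_is_new = resolve_option("Yes", prebuilt_time)
```

Here `existing` points at the prebuilt entry for ‘midnight’, while `added` is a fresh entry destined for an automatically built grammar.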

Referring to FIG. 1, a high-level flowchart of a method 10 in a speech recognition application callflow is shown. The method 10 can include the step 11 of assigning an individual option and a pre-built grammar to the same prompt. At decision block 12, if the individual option is a potential valid match to a recognition phrase, then the method treats the individual option as a valid output of the pre-built grammar at step 15. Likewise, at decision block 13, if the individual option is a potential valid match to an annotation in the pre-built grammar, then the method also treats the individual option as a valid output of the pre-built grammar at step 15. If the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar, then the individual option can be treated as an independent grammar from the pre-built grammar at step 14.

Referring to FIG. 2, a possible instantiation of a callflow GUI with system- and user-generated labels for callflow elements is shown in accordance with the present invention. In particular, the callflow GUI 20 illustrates a reminder system where callflow element 22 welcomes the user to the system. Callflow element 24 determines a particular date using user defined variable “Date”, the value of which will be an output of the grammar named date.jsgf. Callflow element 26 confirms an entry for the date. Callflow element 28 determines a time using user defined variable ‘time’, the value of which will be an output of the grammar named time.jsgf. Callflow element 30 then confirms the entry from the time. Callflow element 32 then prompts the user to record at the tone and callflow element 34 prompts the user to determine if another reminder is desired. Note that the prompt in 34 can take as speech input any valid phrase in the date.jsgf grammar plus ‘Yes’ or ‘No’. Without inspection, it is not possible to determine whether ‘Yes’ and/or ‘No’ are valid phrases in the date.jsgf grammar. For example, suppose the designer has created the callflow shown in FIG. 2, with ‘Yes’ and ‘No’ defined as valid responses to the prompt in 34 along with the date.jsgf grammar. The high-level flowchart of FIG. 1 would then illustrate the actions a system would take in evaluating these options as previously described above. If no further reminders are to be set, then the callflow element 36 provides a goodbye greeting.

Assume that a system exists for the graphical building of speech recognition callflows. A key component of such a system would be a prompt, that is, a request for user input. The prompt could have a symbolic representation similar to the callflow element 29 shown in FIG. 3A. Note that the symbol contains an automatically-generated label (“12345”), prompt text (“This is a prompt”), and a placeholder for a grammar option. Through some well-known means (property sheet, drag-and-drop, etc.), the designer can select from a set of prebuilt grammars (such as the built-in types in VoiceXML, custom-built grammars from a library, etc.) a grammar for the prompt as shown in FIG. 3B. The designer can also select one or more additional options for recognition at that prompt.
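One way to picture the prompt symbol of FIGS. 3A and 3B is as a small record holding a label, prompt text, the selected prebuilt grammars and any additional options. The Python sketch below is purely illustrative; the class name `PromptElement`, its field names and the `describe` helper are hypothetical and are not taken from the described system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptElement:
    """Hypothetical in-memory model of a prompt symbol in the callflow builder."""
    label: str                                          # automatically generated label
    prompt_text: str                                    # text spoken to the caller
    grammars: List[str] = field(default_factory=list)   # selected prebuilt grammars
    options: List[str] = field(default_factory=list)    # individual new options

    def describe(self) -> str:
        """Summarize what the prompt accepts, for display in the workspace."""
        parts = self.grammars + [f"'{option}'" for option in self.options]
        accepted = ", ".join(parts) if parts else "(no grammar selected)"
        return f"[{self.label}] {self.prompt_text} -> {accepted}"


# FIG. 3A: a fresh prompt with an empty grammar placeholder.
element = PromptElement(label="12345", prompt_text="This is a prompt")
empty_summary = element.describe()

# FIG. 3B plus one additional option for recognition at the prompt.
element.grammars.append("time.jsgf")
element.options.append("midnight")
full_summary = element.describe()
```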

For example, suppose the designer has created the callflow shown in FIG. 4, then determines that there is a need to disambiguate ‘midnight’ as a special case if spoken in response to the request for the reminder time. The callflow element 102 of FIG. 4 and the high-level flowchart of FIG. 1 would then illustrate the actions a system would take in evaluating this new option as previously described above.

While these techniques can be generalized to any code generated from the callflow, here is an example of a VoiceXML form capable of being automatically generated from the information provided in the graphical callflow for the Time prompt (assuming that ‘midnight’ was NOT a valid input or annotation for time.jsgf):

    <form id="Time">
      <field name="Time">
        <prompt>
          <audio src="Time.wav"> For what time? </audio>
        </prompt>
        <grammar src="time.jsgf"/>
        <grammar>midnight {midnight} </grammar>
        <filled>
          <if cond="Time == 'midnight'">
            <goto next="#Midnight"/>
          </if>
          <goto next="#C0020"/>
        </filled>
      </field>
    </form>

Finally, here is an example of a VoiceXML form capable of being generated from the information provided in the graphical callflow for the Time prompt, assuming that ‘midnight’ IS a valid input for time.jsgf, and that the annotation returned for ‘midnight’ is 1200AM:

    <form id="Time">
      <field name="Time">
        <prompt>
          <audio src="Time.wav"> For what time? </audio>
        </prompt>
        <grammar src="time.jsgf"/>
        <filled>
          <if cond="Time == '1200AM'">
            <goto next="#Midnight"/>
          </if>
          <goto next="#C0020"/>
        </filled>
      </field>
    </form>
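The two forms differ only in whether an inline grammar is emitted and in which value the condition tests. A code generator for that difference could look roughly like the following Python sketch. It is an assumption-laden illustration, not the generator of the described system: the function `build_form` and its parameters are hypothetical, and the audio file name is simply derived from the field name as in the examples above.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_form(name: str, prompt_text: str, grammar_src: str, option: str,
               annotation: Optional[str], special_target: str,
               next_target: str) -> str:
    """Build a VoiceXML form for one prompt with one specially treated option.

    `annotation` is the value the prebuilt grammar returns for `option`,
    or None when the option was not found in the prebuilt grammar.
    """
    lines = [
        f"<form id={quoteattr(name)}>",
        f"  <field name={quoteattr(name)}>",
        "    <prompt>",
        f"      <audio src={quoteattr(name + '.wav')}> {escape(prompt_text)} </audio>",
        "    </prompt>",
        f"    <grammar src={quoteattr(grammar_src)}/>",
    ]
    if annotation is None:
        # Option not in the prebuilt grammar: add an inline grammar whose
        # recognition text and annotation are both the option text.
        lines.append(f"    <grammar>{escape(option)} {{{escape(option)}}} </grammar>")
        annotation = option
    condition = f"{name} == '{annotation}'"
    lines += [
        "    <filled>",
        f"      <if cond={quoteattr(condition)}>",
        f"        <goto next={quoteattr('#' + special_target)}/>",
        "      </if>",
        f"      <goto next={quoteattr('#' + next_target)}/>",
        "    </filled>",
        "  </field>",
        "</form>",
    ]
    return "\n".join(lines)


# 'midnight' absent from time.jsgf: an inline grammar is generated.
form_without_match = build_form("Time", "For what time?", "time.jsgf",
                                "midnight", None, "Midnight", "C0020")
# 'midnight' present with annotation 1200AM: only the condition changes.
form_with_match = build_form("Time", "For what time?", "time.jsgf",
                             "midnight", "1200AM", "Midnight", "C0020")
```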

Note that in searching the grammar (shown in the list below, using a JSGF grammar as an example, although this would be workable for any type of grammar that includes recognition text and annotations, including bnf, srcl, SRGS XML, SRGS ABNF, etc.), it could be determined that ‘midnight’ was in the grammar, and that the annotation for midnight was ‘1200AM’, which enabled the automatic generation of the <if> statement in the form code above. All of this is capable of being done without any detailed knowledge about the content of the prebuilt grammars on the part of the callflow designer.

    #JSGF V1.0 iso-8859-1;

    grammar time;

    public <time> = [<starter>] [<at>] <hour> [o clock] {needampm}
        | [<starter>] [<at>] <hour> <minute> {needampm}
        | [<starter>] [<at>] <hour> [o clock] <ampm>
        | [<starter>] [<at>] <hour> <minute> <ampm>
        | [<starter>] [<at>] half past <hour> {needampm}
        | [<starter>] [<at>] half past <hour> <ampm>
        | [<starter>] [<at>] [a] quarter till <hour> {needampm}
        | [<starter>] [<at>] [a] quarter till <hour> <ampm>
        | [<starter>] [<at>] <minute> till <hour> {needampm}
        | [<starter>] [<at>] <minute> till <hour> <ampm>
        | [<starter>] [<at>] <minute> after <hour> {needampm}
        | [<starter>] [<at>] <minute> after <hour> <ampm>
        | [<starter>] [<at>] noon {noon}
        | [<starter>] [<at>] midnight {1200AM}
        ;
    <starter> = set | set time | set reminder time ;
    <at> = at | for ;
    <hour> = one | two | three | four | five | six | seven | eight | nine | ten | eleven | twelve ;
    <minute> = <units> | <teens> | <tens> | <tens> <units> ;
    <units> = one | two | three | four | five | six | seven | eight | nine ;
    <teens> = ten | eleven | twelve | thirteen | fourteen | fifteen | sixteen | seventeen | eighteen | nineteen ;
    <tens> = twenty | thirty | forty | fifty ;
    <ampm> = AM | PM ;
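To give a feel for how such a search might locate ‘midnight’ and its annotation, the following Python sketch scans the alternatives of a single JSGF rule. It is deliberately simplistic and is not a JSGF parser: it assumes one tag per alternative, does not expand rule references or nested groups, ignores optional parts such as [<starter>], and only phrase-matches alternatives made up entirely of literal words. The function name `find_option` and the shortened rule text are hypothetical.

```python
import re
from typing import Optional, Tuple

# Shortened excerpt of the <time> rule, for illustration only.
TIME_RULE = """
public <time> = [<starter>] [<at>] <hour> [o clock] {needampm}
    | [<starter>] [<at>] noon {noon}
    | [<starter>] [<at>] midnight {1200AM}
    ;
"""


def find_option(option: str, rule_text: str) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (phrase, annotation) for the alternative matching the option.

    An alternative matches if its required literal words equal the option,
    or if its tag (annotation) equals the option. Returns None if no match.
    """
    body = rule_text.split("=", 1)[1].rsplit(";", 1)[0]
    wanted = option.lower().split()
    for alternative in body.split("|"):
        tag_match = re.search(r"\{([^}]*)\}", alternative)
        tag = tag_match.group(1).strip() if tag_match else None

        required = re.sub(r"\[[^\]]*\]", " ", alternative)  # drop optional parts
        required = re.sub(r"\{[^}]*\}", " ", required)      # drop the tag
        # Alternatives that still reference other rules are not expanded here.
        words = None if "<" in required else required.lower().split()

        phrase_hit = words is not None and words == wanted
        tag_hit = tag is not None and tag.lower() == option.strip().lower()
        if phrase_hit or tag_hit:
            phrase = " ".join(words) if words is not None else None
            return phrase, tag
    return None


midnight_hit = find_option("midnight", TIME_RULE)
annotation_hit = find_option("1200AM", TIME_RULE)
yes_hit = find_option("Yes", TIME_RULE)
```

A hit on either the phrase or the annotation yields the annotation needed for the generated condition, while a miss (as for ‘Yes’) signals that an auxiliary grammar entry would be needed.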

It should be understood that the present invention can be realized in hardware, software, or a combination of hardware and software. The present invention can also be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software can be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.

The present invention also can be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.

This invention can be embodied in other forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.

1. A method in a speech recognition application callflow, comprising the steps of: placing a prompt into a workspace for the speech recognition application workflow; and attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt.
2. The method of claim 1, wherein the step of attaching the pre-built grammar comprises the step of selecting the pre-built grammar from a list.
3. The method of claim 2, wherein the method further comprises the step of searching the list of pre-built grammars for matches to the user-entered individual new option.
4. The method of claim 3, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option points to an equivalent pre-built grammar.
5. The method of claim 3, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option forms a part of the list of pre-built grammars.
6. The method of claim 1, wherein the pre-built grammars are selected from the group comprising VoiceXML and custom-built grammars from a library.
7. The method of claim 1, wherein the method further comprises the step of enabling a customized user selective output of the pre-built grammar.
8. The method of claim 1, wherein the method supports prototyping without knowledge of a grammar structure by a user.
9. The method of claim 3, wherein the method further comprises the step of feeding the result of the step of searching to the pre-defined grammar instead of forming an auxiliary grammar.
10. A method in a speech recognition application callflow, comprising the steps of: assigning an individual option and a pre-built grammar to a same prompt; treating the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase or an annotation in the pre-built grammar; and treating the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
11. A system for managing grammar options in a graphical callflow builder, comprising: a memory; and a processor programmed to place a prompt into a workspace for the speech recognition application workflow; and attach at least one among a pre-built grammar and a user-entered individual new option to the prompt.
12. The system of claim 11, wherein the processor attaches the pre-built grammar by selecting the pre-built grammar from a list.
13. The system of claim 12, wherein the processor is further programmed to search the list of pre-built grammars for matches to the user-entered individual new option.
14. The system of claim 13, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option points to an equivalent pre-built grammar.
15. The system of claim 13, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option forms a part of the list of pre-built grammars.
16. The system of claim 11, wherein the pre-built grammars are selected from the group comprising VoiceXML and custom-built grammars from a library.
17. The system of claim 11, wherein the processor is further programmed to enable a customized user selective output of the pre-built grammar.
18. The system of claim 13, wherein the processor is further programmed to feed the result of the search to the pre-defined grammar instead of forming an auxiliary grammar.
19. A machine-readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of placing a prompt into a workspace for the speech recognition application workflow and attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt.
20. The machine-readable storage of claim 19, wherein the machine-readable storage is further programmed to select the pre-built grammar from a list.